Walker Lake ¶
While the loss of the Aral Sea in Kazakhstan and Lake Urmia in Iran has received a lot of attention over the last few decades, this trend is a global phenomenon. Recently a number of papers have been published, including one focusing on the Decline of the world's saline lakes. Many of these lakes have lost the majority of their volume over the last century, including Walker Lake (Nevada, USA), which has lost 90 percent of its volume over the last 100 years.
The following example is intended to replicate the typical processing required in change detection studies similar to the Decline of the world's saline lakes.
import intake
import numpy as np
import xarray as xr
import holoviews as hv
import geoviews as gv
import datashader as ds
import cartopy.crs as ccrs
import pandas as pd
import glob
from colorcet import coolwarm
from holoviews.operation.datashader import rasterize, regrid, shade
hv.extension('bokeh', width=80)
# Set the memory limit for the local Dask cluster. The value below
# (10e10 bytes, i.e. 100 GB) is generous; choose a smaller one (e.g. 6e9)
# to stress the out-of-core processing infrastructure.
from dask.distributed import Client
client = Client(memory_limit=10e10, processes=False)
client
Landsat Image Data ¶
To replicate this study, we first have to obtain the data from primary sources. The conventional way to obtain Landsat image data is to download it through USGS's EarthExplorer or NASA's Giovanni, but to facilitate the example two images have been downloaded from EarthExplorer and cached.
The two images used by the original study are LT05_L1TP_042033_19881022_20161001_01_T1 and LC08_L1TP_042033_20171022_20171107_01_T1 from 1988/10/22 and 2017/10/22 respectively. These images contain Landsat Surface Reflectance Level-2 Science Product images.
Loading into xarray via intake ¶
In the next cells, we load the Landsat 5 and Landsat 8 files, each into a single xarray DataArray, using intake. Data sources and caching parameters are specified in a catalog file. Intake is optional, since any other method of creating an xarray.DataArray object would work here as well, but it makes it simpler to work with remote datasets while caching them locally.
cat = intake.open_catalog('../catalog.yml')
l5 = cat.l5()
L5_img = l5.read_chunked()
L5_img
l8 = cat.l8()
L8_img = l8.read_chunked()
L8_img
Now let us view some metadata about the Landsat 5 DataArray:
print("The shape of the DataArray is :", L5_img.shape)
print("With attributes:\n ", '\n '.join('%s=%s'%(k,v) for k,v in L5_img.attrs.items()))
We can use the EPSG value shown above under the crs key to create a cartopy coordinate reference system that we will use later in this notebook:
crs=ccrs.epsg(32611)
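As a sanity check on that EPSG value: codes of the form 326NN denote WGS 84 / UTM zone NN in the northern hemisphere (327NN in the southern), so 32611 is UTM zone 11N. The sketch below derives the code from a longitude/latitude pair using only the standard library. The helper name `utm_epsg_from_lon_lat` and the approximate Walker Lake coordinates are assumptions made for illustration, and the special-case zones around Norway and Svalbard are ignored.

```python
import math


def utm_epsg_from_lon_lat(lon, lat):
    """Return the WGS 84 / UTM EPSG code for a point given in degrees.

    Hypothetical helper for illustration; it ignores the irregular
    zones around Norway and Svalbard.
    """
    # UTM zones are 6 degrees wide and numbered 1..60 starting at 180 W
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)  # clamp lon == 180 into zone 60
    # 326xx is used north of the equator, 327xx south of it
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate location of Walker Lake (assumed for this example)
walker_lake_epsg = utm_epsg_from_lon_lat(-118.7, 38.7)
print(walker_lake_epsg)
```

If the printed code differs from the crs attribute of the loaded image, the image is in a different projection and the value passed to ccrs.epsg should follow the attribute, not this calculation.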
L5_img.data[L5_img.data==-9999] = np.nan # Replace the -9999 nodata value with NaN
ndvi5_array = (L5_img[4]-L5_img[3])/(L5_img[4]+L5_img[3])
ndvi5 = ndvi5_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]
ndvi5 = client.persist(ndvi5) # Keep the persisted result; persist returns a new collection
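The NDVI computed in the cell above follows the standard definition (NIR - Red) / (NIR + Red). As a minimal sketch of the same arithmetic on plain NumPy arrays, the function below takes the two bands explicitly, so it makes no assumption about which index of the image holds which band. The function name `ndvi` and the sample reflectance values are made up for illustration; only the -9999 fill value comes from the notebook.

```python
import numpy as np

NODATA = -9999  # fill value used in the cells above


def ndvi(nir, red, nodata=NODATA):
    """Compute (NIR - Red) / (NIR + Red), returning NaN for invalid pixels."""
    # Convert to float and mask nodata pixels before any arithmetic
    nir = np.where(nir == nodata, np.nan, nir.astype("float64"))
    red = np.where(red == nodata, np.nan, red.astype("float64"))
    total = nir + red
    # Suppress warnings for 0/0; those pixels are set to NaN below
    with np.errstate(divide="ignore", invalid="ignore"):
        out = (nir - red) / total
    return np.where(total == 0, np.nan, out)


# Made-up values: vegetation-like, water-like, nodata, and all-zero pixels
nir = np.array([[4000, 500], [NODATA, 0]])
red = np.array([[1000, 1500], [800, 0]])
result = ndvi(nir, red)
print(result)
```

Masking before dividing matters: if the -9999 values were left in place they would yield plausible-looking but meaningless ratios.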
Computing the NDVI (2017) ¶
Now we can do this for the Landsat 8 files for the 2017 image:
L8_img.data[L8_img.data==-9999] = np.nan # Replace the -9999 nodata value with NaN
ndvi8_array = (L8_img[4]-L8_img[3])/(L8_img[4]+L8_img[3])
ndvi8 = ndvi8_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]
ndvi8 = client.persist(ndvi8) # Keep the persisted result; persist returns a new collection
Viewing change via dropdown ¶
Using datashader together with geoviews, we can now easily build an interactive visualization where we select between the 1988 and 2017 images. The use of datashader allows these images to be dynamically updated according to zoom level (note: it can take datashader a minute to 'warm up' before it becomes fully interactive). For more information on how the dropdown widget was created using HoloMap, please refer to the HoloMap reference.
%opts Image (cmap='viridis') [width=450 height=450 tools=['hover'] colorbar=True]
# Map 'x' and 'y' from rasterio to 'lon' and 'lat'
hmap = hv.HoloMap({'1988': gv.Image(ndvi5, crs=crs, vdims=['ndvi'], rtol=10),
                   '2017': gv.Image(ndvi8, crs=crs, vdims=['ndvi'], rtol=10)},
                  kdims=['Year']).redim(x='lon', y='lat')
rasterize(hmap)
Computing statistics and projecting display ¶
The rest of the notebook shows how statistical operations can reduce the dimensionality of the data, for example to compute new features that could be used as part of an ML pipeline.
The mean and sum over the two time points ¶
The next plot (may take a minute to compute) shows the mean of the two NDVI images next to the sum of them:
mean_avg = hmap.collapse(dimensions=['Year'], function=np.mean)
mean_img = gv.Image(mean_avg.data, crs=crs, kdims=['lon', 'lat'],
vdims=['ndvi']).relabel('Mean over Year')
summed = hmap.collapse(dimensions=['Year'], function=np.sum)
summed_image = gv.Image(summed.data, crs=crs, kdims=['lon', 'lat'],
vdims=['ndvi']).relabel('Sum over Year')
If you are getting a lot of warnings, it is possible that the two images are on different grids. We can check whether summed.data is all null. If it is, we need to regrid both images onto a common grid and recompute the mean and sum.
if summed.data.ndvi.isnull().all():
    res = 100  # grid spacing in CRS units
    x = np.arange(min(ndvi5.x.min(), ndvi8.x.min()), max(ndvi5.x.max(), ndvi8.x.max()), res)
    y = np.arange(min(ndvi5.y.min(), ndvi8.y.min()), max(ndvi5.y.max(), ndvi8.y.max()), res)
    # Interpolate both images onto the common grid;
    # map 'x' and 'y' from rasterio to 'lon' and 'lat'
    hmap = hv.HoloMap({'1988': gv.Image(ndvi5.interp(x=x, y=y), crs=crs, vdims=['ndvi']),
                       '2017': gv.Image(ndvi8.interp(x=x, y=y), crs=crs, vdims=['ndvi'])},
                      kdims=['Year']).redim(x='lon', y='lat')
    mean_avg = hmap.collapse(dimensions=['Year'], function=np.mean)
    mean_img = gv.Image(mean_avg.data, crs=crs, kdims=['lon', 'lat'],
                        vdims=['ndvi']).relabel('Mean over Year')
    summed = hmap.collapse(dimensions=['Year'], function=np.sum)
    summed_image = gv.Image(summed.data, crs=crs, kdims=['lon', 'lat'],
                            vdims=['ndvi']).relabel('Sum over Year')
rasterize(mean_img) + rasterize(summed_image)
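To make the regridding step concrete, here is a one-dimensional stand-in for the two-dimensional interp call, using only NumPy. Two made-up transects sampled at offset coordinates are interpolated linearly onto a shared grid, after which the mean and sum over the "Year" axis are well defined wherever both transects have data. The helper name `regrid_1d`, the coordinates and the values are all assumptions for illustration; this is not how HoloViews or xarray implement the operation internally.

```python
import numpy as np


def regrid_1d(coords, values, new_coords):
    """Linearly interpolate onto new_coords; NaN outside the source extent."""
    return np.interp(new_coords, coords, values, left=np.nan, right=np.nan)


# Two made-up transects whose coordinates are offset by half a cell
x_a = np.array([0.0, 100.0, 200.0, 300.0])
v_a = np.array([0.1, 0.2, 0.3, 0.4])
x_b = np.array([50.0, 150.0, 250.0, 350.0])
v_b = np.array([0.5, 0.6, 0.7, 0.8])

# Common grid spanning both extents, built as in the cell above
res = 50.0
common = np.arange(min(x_a.min(), x_b.min()), max(x_a.max(), x_b.max()), res)

# Stack along a new leading axis that plays the role of 'Year'
stack = np.stack([regrid_1d(x_a, v_a, common),
                  regrid_1d(x_b, v_b, common)])
mean = stack.mean(axis=0)   # NaN wherever either transect has no data
total = stack.sum(axis=0)
print(common)
print(mean)
print(total)
```

Without the shared grid, the two sets of coordinates have no points in common, which is the situation the all-null check above is designed to detect.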
Difference in NDVI between 1988 and 2017 ¶
The change in Walker Lake as viewed using the NDVI can be shown by subtracting the NDVI recorded in 1988 from the NDVI recorded in 2017:
# Subtract the 1988 NDVI from the 2017 NDVI (2017 - 1988), matching the description above
diff = np.subtract(hmap['2017'].data, hmap['1988'].data)
difference = gv.Image(diff, crs=crs, kdims=['lon', 'lat'], vdims=['ndvi'])
difference = difference.relabel('Difference in NDVI').redim(ndvi='delta_ndvi')
rasterize(difference).redim.range(delta_ndvi=(-1.0,1.0)).options(cmap=coolwarm)
You can see a large change (positive delta) in the areas where there is water, indicating a reduction in the size of the lake over this time period.
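The sign convention can be checked on a toy example. Water typically has low or negative NDVI, so a pixel that was open water in 1988 and exposed lakebed in 2017 yields a positive value when the 1988 NDVI is subtracted from the 2017 NDVI. The arrays and the threshold of 0.2 below are made-up values for illustration, not measurements from the Landsat scenes.

```python
import numpy as np

# Made-up 2x2 NDVI grids: the top row is water in 1988; by 2017 the
# top-right pixel has become exposed lakebed
ndvi_1988 = np.array([[-0.30, -0.25],
                      [0.20, 0.15]])
ndvi_2017 = np.array([[-0.28, 0.10],
                      [0.22, 0.12]])

# 2017 minus 1988: positive where NDVI increased over the period
delta = ndvi_2017 - ndvi_1988

# Hypothetical threshold for flagging a pixel as "changed"
threshold = 0.2
changed = np.abs(delta) > threshold
print(delta)
print(changed)
```

Only the pixel that switched from water to land exceeds the threshold; the small differences elsewhere represent the kind of year-to-year variation a real analysis would need to separate from genuine change.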
Slicing across lon and lat ¶
As a final example, we can use the sample method to slice across the difference in NDVI along (roughly) the midpoint of the latitude and the midpoint of the longitude. To do this, we define the following helper function to convert longitude/latitude into the corresponding coordinate values used by the Dataset:
def from_lon_lat(x, y):
    # Transform a longitude/latitude pair (PlateCarree) into the image CRS
    return crs.transform_point(x, y, ccrs.PlateCarree())
%%opts Curve [width=600 tools=['hover']]
lon_y, lat_x = from_lon_lat(-118, 39) # Longitude of -118 and Latitude of 39
(difference.sample(lat=lat_x) + difference.sample(lon=lon_y)).cols(1)
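Conceptually, each slice above is a one-dimensional profile taken along a single row or column of the grid. A minimal NumPy sketch of that idea, using nearest-neighbour lookup of the requested coordinate, is shown below. The helper `sample_nearest` and the tiny grid are hypothetical, and HoloViews' sample method may resolve coordinates differently.

```python
import numpy as np


def sample_nearest(grid, xs, ys, x=None, y=None):
    """Return (coords, profile) along the row or column nearest to x or y.

    grid has shape (len(ys), len(xs)); pass exactly one of x or y.
    """
    if (x is None) == (y is None):
        raise ValueError("pass exactly one of x or y")
    if y is not None:
        # Fix y: pick the nearest row and return the profile along x
        row = int(np.abs(ys - y).argmin())
        return xs, grid[row, :]
    # Fix x: pick the nearest column and return the profile along y
    col = int(np.abs(xs - x).argmin())
    return ys, grid[:, col]


# Tiny made-up grid with coordinates in projected units
xs = np.array([0.0, 100.0, 200.0])
ys = np.array([0.0, 100.0])
grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

_, row_profile = sample_nearest(grid, xs, ys, y=90.0)
_, col_profile = sample_nearest(grid, xs, ys, x=120.0)
print(row_profile)
print(col_profile)
```

Note that the coordinates passed in must be in the same units as the grid, which is why the notebook converts longitude and latitude into the image CRS with from_lon_lat before sampling.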